Dataset statistics
| Number of variables | 14 |
|---|---|
| Number of observations | 309117 |
| Missing cells | 49688 |
| Missing cells (%) | 1.1% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 33.0 MiB |
| Average record size in memory | 112.0 B |
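The percentages and sizes in this table follow from the raw counts. The short sketch below re-derives them as a sanity check; the variable names are illustrative and not part of the report, and it assumes the memory figure uses binary mebibytes (1 MiB = 1024² bytes).

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 14
n_observations = 309_117
missing_cells = 49_688
total_size_mib = 33.0

# Share of missing cells over the whole table (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# The report lists all 49688 missing values under GEMIDDELDE_VERKOOPPRIJS,
# so the missing rate within that single column is much higher.
column_missing_pct = 100 * missing_cells / n_observations

# Average bytes per row, assuming 1 MiB = 1024**2 bytes.
avg_record_bytes = total_size_mib * 1024**2 / n_observations

print(f"{missing_pct:.1f}%")          # share of all cells
print(f"{column_missing_pct:.1f}%")   # share within the one affected column
print(f"{avg_record_bytes:.0f} B")    # average record size
```

The same 49688 missing cells account for about 1.1% of the whole table but about 16.1% of the one column that contains them, which is why both figures appear in the report.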
Variable types
| Categorical | 4 |
|---|---|
| DateTime | 1 |
| Numeric | 9 |
Alerts
VERSIE has constant value "1.0" | Constant |
DATUM_BESTAND has constant value "2022-08-26" | Constant |
PEILDATUM has constant value "2022-08-01" | Constant |
TYPERENDE_DIAGNOSE_CD has a high cardinality: 1875 distinct values | High cardinality |
BEHANDELEND_SPECIALISME_CD is highly correlated with AANTAL_PAT_PER_SPC | High correlation |
AANTAL_PAT_PER_ZPD is highly correlated with AANTAL_SUBTRAJECT_PER_ZPD | High correlation |
AANTAL_SUBTRAJECT_PER_ZPD is highly correlated with AANTAL_PAT_PER_ZPD | High correlation |
AANTAL_PAT_PER_DIAG is highly correlated with AANTAL_SUBTRAJECT_PER_DIAG | High correlation |
AANTAL_SUBTRAJECT_PER_DIAG is highly correlated with AANTAL_PAT_PER_DIAG | High correlation |
AANTAL_PAT_PER_SPC is highly correlated with BEHANDELEND_SPECIALISME_CD and 1 other fields | High correlation |
AANTAL_SUBTRAJECT_PER_SPC is highly correlated with AANTAL_PAT_PER_SPC | High correlation |
AANTAL_PAT_PER_SPC is highly correlated with AANTAL_SUBTRAJECT_PER_SPC | High correlation |
VERSIE is highly correlated with DATUM_BESTAND and 1 other fields | High correlation |
DATUM_BESTAND is highly correlated with VERSIE and 1 other fields | High correlation |
PEILDATUM is highly correlated with VERSIE and 1 other fields | High correlation |
JAAR is highly correlated with AANTAL_PAT_PER_SPC and 1 other fields | High correlation |
AANTAL_PAT_PER_SPC is highly correlated with JAAR and 1 other fields | High correlation |
AANTAL_SUBTRAJECT_PER_SPC is highly correlated with JAAR and 1 other fields | High correlation |
GEMIDDELDE_VERKOOPPRIJS has 49688 (16.1%) missing values | Missing |
AANTAL_SUBTRAJECT_PER_ZPD is highly skewed (γ1 = 21.41443709) | Skewed |
Reproduction
| Analysis started | 2022-09-06 19:04:41.717331 |
|---|---|
| Analysis finished | 2022-09-06 19:05:05.282810 |
| Duration | 23.57 seconds |
| Software version | pandas-profiling v3.2.0 |
| Download configuration | config.json |
VERSIE
Categorical
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.4 MiB |
| 1.0 |
|---|
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 927351 |
|---|---|
| Distinct characters | 3 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1.0 |
|---|---|
| 2nd row | 1.0 |
| 3rd row | 1.0 |
| 4th row | 1.0 |
| 5th row | 1.0 |
Common Values
| Value | Count | Frequency (%) |
| 1.0 | 309117 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 309117 | |
| . | 309117 | |
| 0 | 309117 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 618234 | |
| Other Punctuation | 309117 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 309117 | |
| 0 | 309117 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 309117 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 927351 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 309117 | |
| . | 309117 | |
| 0 | 309117 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 927351 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 309117 | |
| . | 309117 | |
| 0 | 309117 |
DATUM_BESTAND
Categorical
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.4 MiB |
| 2022-08-26 |
|---|
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 10 |
| Min length | 10 |
Characters and Unicode
| Total characters | 3091170 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2022-08-26 |
|---|---|
| 2nd row | 2022-08-26 |
| 3rd row | 2022-08-26 |
| 4th row | 2022-08-26 |
| 5th row | 2022-08-26 |
Common Values
| Value | Count | Frequency (%) |
| 2022-08-26 | 309117 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 1236468 | |
| 0 | 618234 | |
| - | 618234 | |
| 8 | 309117 | 10.0% |
| 6 | 309117 | 10.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 2472936 | |
| Dash Punctuation | 618234 | 20.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 1236468 | |
| 0 | 618234 | |
| 8 | 309117 | 12.5% |
| 6 | 309117 | 12.5% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 618234 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 3091170 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 2 | 1236468 | |
| 0 | 618234 | |
| - | 618234 | |
| 8 | 309117 | 10.0% |
| 6 | 309117 | 10.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3091170 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2 | 1236468 | |
| 0 | 618234 | |
| - | 618234 | |
| 8 | 309117 | 10.0% |
| 6 | 309117 | 10.0% |
PEILDATUM
Categorical
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.4 MiB |
| 2022-08-01 |
|---|
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 10 |
| Min length | 10 |
Characters and Unicode
| Total characters | 3091170 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2022-08-01 |
|---|---|
| 2nd row | 2022-08-01 |
| 3rd row | 2022-08-01 |
| 4th row | 2022-08-01 |
| 5th row | 2022-08-01 |
Common Values
| Value | Count | Frequency (%) |
| 2022-08-01 | 309117 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 927351 | |
| 0 | 927351 | |
| - | 618234 | |
| 8 | 309117 | 10.0% |
| 1 | 309117 | 10.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 2472936 | |
| Dash Punctuation | 618234 | 20.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 927351 | |
| 0 | 927351 | |
| 8 | 309117 | 12.5% |
| 1 | 309117 | 12.5% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 618234 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 3091170 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 2 | 927351 | |
| 0 | 927351 | |
| - | 618234 | |
| 8 | 309117 | 10.0% |
| 1 | 309117 | 10.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3091170 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2 | 927351 | |
| 0 | 927351 | |
| - | 618234 | |
| 8 | 309117 | 10.0% |
| 1 | 309117 | 10.0% |
JAAR
DateTime
| Distinct | 11 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.4 MiB |
| Minimum | 2012-01-01 00:00:00 |
|---|---|
| Maximum | 2022-01-01 00:00:00 |
BEHANDELEND_SPECIALISME_CD
Real number (ℝ≥0)
| Distinct | 28 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 435.2313299 |
| Minimum | 301 |
|---|---|
| Maximum | 8418 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.4 MiB |
Quantile statistics
| Minimum | 301 |
|---|---|
| 5-th percentile | 302 |
| Q1 | 305 |
| median | 313 |
| Q3 | 322 |
| 95-th percentile | 335 |
| Maximum | 8418 |
| Range | 8117 |
| Interquartile range (IQR) | 17 |
Descriptive statistics
| Standard deviation | 977.2946637 |
|---|---|
| Coefficient of variation (CV) | 2.24546028 |
| Kurtosis | 62.61042729 |
| Mean | 435.2313299 |
| Median Absolute Deviation (MAD) | 8 |
| Skewness | 8.032311663 |
| Sum | 134537403 |
| Variance | 955104.8597 |
| Monotonicity | Not monotonic |
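Several rows in the quantile and descriptive tables above are derived from others. The sketch below recomputes them from the reported figures for this column; the variable names are illustrative, and small differences in the last digits are expected because the inputs are already rounded.

```python
# Figures copied from the quantile and descriptive statistics tables above.
minimum, q1, q3, maximum = 301, 305, 322, 8418
mean = 435.2313299
std = 977.2946637

value_range = maximum - minimum   # "Range" row
iqr = q3 - q1                     # "Interquartile range (IQR)" row
cv = std / mean                   # "Coefficient of variation (CV)" row
variance = std ** 2               # "Variance" row

print(value_range, iqr)
print(f"{cv:.6f}")
print(f"{variance:.2f}")
```

A coefficient of variation above 2 alongside an IQR of only 17 suggests that a few codes far from the bulk (such as 1900, 8416 and 8418) dominate the standard deviation.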
| Value | Count | Frequency (%) |
| 305 | 43554 | |
| 313 | 40271 | |
| 303 | 35594 | |
| 330 | 24609 | 8.0% |
| 316 | 20945 | 6.8% |
| 308 | 16266 | 5.3% |
| 306 | 12929 | 4.2% |
| 324 | 12752 | 4.1% |
| 301 | 12529 | 4.1% |
| 304 | 10098 | 3.3% |
| Other values (18) | 79570 |
| Value | Count | Frequency (%) |
| 301 | 12529 | 4.1% |
| 302 | 6807 | 2.2% |
| 303 | 35594 | |
| 304 | 10098 | 3.3% |
| 305 | 43554 | |
| 306 | 12929 | 4.2% |
| 307 | 5414 | 1.8% |
| 308 | 16266 | 5.3% |
| 310 | 3448 | 1.1% |
| 313 | 40271 |
| Value | Count | Frequency (%) |
| 8418 | 4207 | 1.4% |
| 8416 | 349 | 0.1% |
| 1900 | 204 | 0.1% |
| 390 | 843 | 0.3% |
| 389 | 3288 | 1.1% |
| 362 | 4209 | 1.4% |
| 361 | 2220 | 0.7% |
| 335 | 3145 | 1.0% |
| 330 | 24609 | |
| 329 | 826 | 0.3% |
TYPERENDE_DIAGNOSE_CD
Categorical
| Distinct | 1875 |
|---|---|
| Distinct (%) | 0.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.4 MiB |
| 101 | 1315 |
|---|---|
| 402 | 1277 |
| 301 | 1245 |
| 403 | 1244 |
| 201 | 1173 |
| Other values (1870) |
Length
| Max length | 4 |
|---|---|
| Median length | 3 |
| Mean length | 3.349608724 |
| Min length | 2 |
Characters and Unicode
| Total characters | 1035421 |
|---|---|
| Distinct characters | 25 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 33 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | 11 |
|---|---|
| 2nd row | 12 |
| 3rd row | 12 |
| 4th row | 11 |
| 5th row | 13 |
Common Values
| Value | Count | Frequency (%) |
| 101 | 1315 | 0.4% |
| 402 | 1277 | 0.4% |
| 301 | 1245 | 0.4% |
| 403 | 1244 | 0.4% |
| 201 | 1173 | 0.4% |
| 203 | 1171 | 0.4% |
| 401 | 1045 | 0.3% |
| 404 | 1035 | 0.3% |
| 409 | 1009 | 0.3% |
| 802 | 996 | 0.3% |
| Other values (1865) | 297607 |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 198144 | |
| 0 | 189682 | |
| 2 | 137110 | |
| 3 | 112400 | |
| 5 | 79637 | |
| 9 | 74842 | 7.2% |
| 4 | 73733 | 7.1% |
| 7 | 60874 | 5.9% |
| 6 | 53990 | 5.2% |
| 8 | 44516 | 4.3% |
| Other values (15) | 10493 | 1.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 1024928 | |
| Uppercase Letter | 10493 | 1.0% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| G | 1976 | |
| M | 1746 | |
| B | 1267 | |
| E | 889 | |
| Z | 874 | |
| D | 698 | 6.7% |
| A | 682 | 6.5% |
| F | 662 | 6.3% |
| C | 345 | 3.3% |
| K | 342 | 3.3% |
| Other values (5) | 1012 |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 198144 | |
| 0 | 189682 | |
| 2 | 137110 | |
| 3 | 112400 | |
| 5 | 79637 | |
| 9 | 74842 | 7.3% |
| 4 | 73733 | 7.2% |
| 7 | 60874 | 5.9% |
| 6 | 53990 | 5.3% |
| 8 | 44516 | 4.3% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 1024928 | |
| Latin | 10493 | 1.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| G | 1976 | |
| M | 1746 | |
| B | 1267 | |
| E | 889 | |
| Z | 874 | |
| D | 698 | 6.7% |
| A | 682 | 6.5% |
| F | 662 | 6.3% |
| C | 345 | 3.3% |
| K | 342 | 3.3% |
| Other values (5) | 1012 |
Common
| Value | Count | Frequency (%) |
| 1 | 198144 | |
| 0 | 189682 | |
| 2 | 137110 | |
| 3 | 112400 | |
| 5 | 79637 | |
| 9 | 74842 | 7.3% |
| 4 | 73733 | 7.2% |
| 7 | 60874 | 5.9% |
| 6 | 53990 | 5.3% |
| 8 | 44516 | 4.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1035421 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 198144 | |
| 0 | 189682 | |
| 2 | 137110 | |
| 3 | 112400 | |
| 5 | 79637 | |
| 9 | 74842 | 7.2% |
| 4 | 73733 | 7.1% |
| 7 | 60874 | 5.9% |
| 6 | 53990 | 5.2% |
| 8 | 44516 | 4.3% |
| Other values (15) | 10493 | 1.0% |
ZORGPRODUCT_CD
Real number (ℝ≥0)
| Distinct | 5987 |
|---|---|
| Distinct (%) | 1.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 439281596 |
| Minimum | 10501002 |
|---|---|
| Maximum | 998418081 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.4 MiB |
Quantile statistics
| Minimum | 10501002 |
|---|---|
| 5-th percentile | 28999037 |
| Q1 | 99799028 |
| median | 149599019 |
| Q3 | 990004004 |
| 95-th percentile | 990516041 |
| Maximum | 998418081 |
| Range | 987917079 |
| Interquartile range (IQR) | 890204976 |
Descriptive statistics
| Standard deviation | 428715106.4 |
|---|---|
| Coefficient of variation (CV) | 0.9759459769 |
| Kurtosis | -1.730415344 |
| Mean | 439281596 |
| Median Absolute Deviation (MAD) | 119600010 |
| Skewness | 0.4747735254 |
| Sum | 1.357894091 × 10¹⁴ |
| Variance | 1.837966424 × 10¹⁷ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 990004009 | 2295 | 0.7% |
| 990004007 | 2239 | 0.7% |
| 990003004 | 2153 | 0.7% |
| 990004006 | 1782 | 0.6% |
| 990356076 | 1638 | 0.5% |
| 990356073 | 1489 | 0.5% |
| 131999228 | 1458 | 0.5% |
| 131999164 | 1433 | 0.5% |
| 990003007 | 1413 | 0.5% |
| 199299013 | 1299 | 0.4% |
| Other values (5977) | 291918 |
| Value | Count | Frequency (%) |
| 10501002 | 8 | |
| 10501003 | 11 | |
| 10501004 | 11 | |
| 10501005 | 11 | |
| 10501007 | 3 | < 0.1% |
| 10501008 | 11 | |
| 10501010 | 11 | |
| 10501011 | 3 | < 0.1% |
| 11101002 | 9 | |
| 11101003 | 11 |
| Value | Count | Frequency (%) |
| 998418081 | 151 | |
| 998418080 | 134 | |
| 998418079 | 38 | < 0.1% |
| 998418077 | 8 | < 0.1% |
| 998418076 | 8 | < 0.1% |
| 998418075 | 6 | < 0.1% |
| 998418074 | 213 | |
| 998418073 | 212 | |
| 998418072 | 8 | < 0.1% |
| 998418071 | 8 | < 0.1% |
AANTAL_PAT_PER_ZPD
Real number (ℝ≥0)
High correlation
| Distinct | 9757 |
|---|---|
| Distinct (%) | 3.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 502.6605363 |
| Minimum | 1 |
|---|---|
| Maximum | 163220 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.4 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 3 |
| median | 13 |
| Q3 | 99 |
| 95-th percentile | 1689 |
| Maximum | 163220 |
| Range | 163219 |
| Interquartile range (IQR) | 96 |
Descriptive statistics
| Standard deviation | 3135.177096 |
|---|---|
| Coefficient of variation (CV) | 6.237165779 |
| Kurtosis | 407.1125908 |
| Mean | 502.6605363 |
| Median Absolute Deviation (MAD) | 12 |
| Skewness | 16.7580401 |
| Sum | 155380917 |
| Variance | 9829335.421 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 51659 | 16.7% |
| 2 | 25244 | 8.2% |
| 3 | 16472 | 5.3% |
| 4 | 12092 | 3.9% |
| 5 | 9420 | 3.0% |
| 6 | 7947 | 2.6% |
| 7 | 6566 | 2.1% |
| 8 | 5589 | 1.8% |
| 9 | 5136 | 1.7% |
| 10 | 4497 | 1.5% |
| Other values (9747) | 164495 |
| Value | Count | Frequency (%) |
| 1 | 51659 | |
| 2 | 25244 | |
| 3 | 16472 | 5.3% |
| 4 | 12092 | 3.9% |
| 5 | 9420 | 3.0% |
| 6 | 7947 | 2.6% |
| 7 | 6566 | 2.1% |
| 8 | 5589 | 1.8% |
| 9 | 5136 | 1.7% |
| 10 | 4497 | 1.5% |
| Value | Count | Frequency (%) |
| 163220 | 1 | |
| 154447 | 1 | |
| 152994 | 1 | |
| 150670 | 1 | |
| 147173 | 1 | |
| 143739 | 1 | |
| 116894 | 1 | |
| 114796 | 1 | |
| 109513 | 1 | |
| 108720 | 1 |
AANTAL_SUBTRAJECT_PER_ZPD
Real number (ℝ≥0)
High correlation, skewed
| Distinct | 10479 |
|---|---|
| Distinct (%) | 3.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 593.7468337 |
| Minimum | 1 |
|---|---|
| Maximum | 239696 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.4 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 3 |
| median | 14 |
| Q3 | 108 |
| 95-th percentile | 1925 |
| Maximum | 239696 |
| Range | 239695 |
| Interquartile range (IQR) | 105 |
Descriptive statistics
| Standard deviation | 4027.55619 |
|---|---|
| Coefficient of variation (CV) | 6.783288704 |
| Kurtosis | 729.5269347 |
| Mean | 593.7468337 |
| Median Absolute Deviation (MAD) | 13 |
| Skewness | 21.41443709 |
| Sum | 183537240 |
| Variance | 16221208.87 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 49785 | 16.1% |
| 2 | 24836 | 8.0% |
| 3 | 16305 | 5.3% |
| 4 | 11877 | 3.8% |
| 5 | 9347 | 3.0% |
| 6 | 7939 | 2.6% |
| 7 | 6474 | 2.1% |
| 8 | 5593 | 1.8% |
| 9 | 5085 | 1.6% |
| 10 | 4505 | 1.5% |
| Other values (10469) | 167371 |
| Value | Count | Frequency (%) |
| 1 | 49785 | |
| 2 | 24836 | |
| 3 | 16305 | 5.3% |
| 4 | 11877 | 3.8% |
| 5 | 9347 | 3.0% |
| 6 | 7939 | 2.6% |
| 7 | 6474 | 2.1% |
| 8 | 5593 | 1.8% |
| 9 | 5085 | 1.6% |
| 10 | 4505 | 1.5% |
| Value | Count | Frequency (%) |
| 239696 | 1 | |
| 231824 | 1 | |
| 230884 | 1 | |
| 226881 | 1 | |
| 226322 | 1 | |
| 224418 | 1 | |
| 222025 | 1 | |
| 218435 | 1 | |
| 212567 | 1 | |
| 212442 | 1 |
AANTAL_PAT_PER_DIAG
Real number (ℝ≥0)
High correlation
| Distinct | 8682 |
|---|---|
| Distinct (%) | 2.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7526.39236 |
| Minimum | 1 |
|---|---|
| Maximum | 225597 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.4 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 37 |
| Q1 | 375 |
| median | 1640 |
| Q3 | 6109 |
| 95-th percentile | 36289 |
| Maximum | 225597 |
| Range | 225596 |
| Interquartile range (IQR) | 5734 |
Descriptive statistics
| Standard deviation | 17666.07277 |
|---|---|
| Coefficient of variation (CV) | 2.347216559 |
| Kurtosis | 34.39923292 |
| Mean | 7526.39236 |
| Median Absolute Deviation (MAD) | 1504 |
| Skewness | 5.093200147 |
| Sum | 2326535827 |
| Variance | 312090127.3 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 21 | 586 | 0.2% |
| 19 | 537 | 0.2% |
| 8 | 517 | 0.2% |
| 9 | 509 | 0.2% |
| 4 | 503 | 0.2% |
| 17 | 481 | 0.2% |
| 2 | 473 | 0.2% |
| 12 | 469 | 0.2% |
| 36 | 466 | 0.2% |
| 7 | 461 | 0.1% |
| Other values (8672) | 304115 |
| Value | Count | Frequency (%) |
| 1 | 444 | |
| 2 | 473 | |
| 3 | 458 | |
| 4 | 503 | |
| 5 | 446 | |
| 6 | 442 | |
| 7 | 461 | |
| 8 | 517 | |
| 9 | 509 | |
| 10 | 353 |
| Value | Count | Frequency (%) |
| 225597 | 23 | |
| 212781 | 24 | |
| 212413 | 23 | |
| 211620 | 17 | |
| 211573 | 25 | |
| 209399 | 17 | |
| 208745 | 19 | |
| 203476 | 17 | |
| 199113 | 16 | |
| 197189 | 20 |
AANTAL_SUBTRAJECT_PER_DIAG
Real number (ℝ≥0)
High correlation
| Distinct | 9716 |
|---|---|
| Distinct (%) | 3.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 10841.23361 |
| Minimum | 1 |
|---|---|
| Maximum | 365524 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.4 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 46 |
| Q1 | 493 |
| median | 2272 |
| Q3 | 8788 |
| 95-th percentile | 51237 |
| Maximum | 365524 |
| Range | 365523 |
| Interquartile range (IQR) | 8295 |
Descriptive statistics
| Standard deviation | 26296.44824 |
|---|---|
| Coefficient of variation (CV) | 2.425595572 |
| Kurtosis | 37.95431952 |
| Mean | 10841.23361 |
| Median Absolute Deviation (MAD) | 2096 |
| Skewness | 5.335213527 |
| Sum | 3351209610 |
| Variance | 691503189.9 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 17 | 434 | 0.1% |
| 4 | 415 | 0.1% |
| 19 | 405 | 0.1% |
| 38 | 401 | 0.1% |
| 13 | 399 | 0.1% |
| 3 | 399 | 0.1% |
| 2 | 397 | 0.1% |
| 7 | 395 | 0.1% |
| 23 | 382 | 0.1% |
| 6 | 382 | 0.1% |
| Other values (9706) | 305108 |
| Value | Count | Frequency (%) |
| 1 | 374 | |
| 2 | 397 | |
| 3 | 399 | |
| 4 | 415 | |
| 5 | 380 | |
| 6 | 382 | |
| 7 | 395 | |
| 8 | 355 | |
| 9 | 337 | |
| 10 | 350 |
| Value | Count | Frequency (%) |
| 365524 | 23 | |
| 345464 | 25 | |
| 338983 | 19 | |
| 334734 | 24 | |
| 327073 | 23 | |
| 321625 | 20 | |
| 311516 | 17 | |
| 307543 | 17 | |
| 295898 | 17 | |
| 286829 | 16 |
AANTAL_PAT_PER_SPC
Real number (ℝ≥0)
High correlation
| Distinct | 297 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 656097.5372 |
| Minimum | 361 |
|---|---|
| Maximum | 1479231 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.4 MiB |
Quantile statistics
| Minimum | 361 |
|---|---|
| 5-th percentile | 39316 |
| Q1 | 246823 |
| median | 736514 |
| Q3 | 991298 |
| 95-th percentile | 1319370 |
| Maximum | 1479231 |
| Range | 1478870 |
| Interquartile range (IQR) | 744475 |
Descriptive statistics
| Standard deviation | 420363.3094 |
|---|---|
| Coefficient of variation (CV) | 0.640702465 |
| Kurtosis | -1.180010047 |
| Mean | 656097.5372 |
| Median Absolute Deviation (MAD) | 318069 |
| Skewness | -0.007527202955 |
| Sum | 2.028109024 × 10¹¹ |
| Variance | 1.767053119 × 10¹¹ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 874172 | 5096 | 1.6% |
| 864705 | 4344 | 1.4% |
| 836189 | 4341 | 1.4% |
| 883510 | 4317 | 1.4% |
| 868158 | 4254 | 1.4% |
| 882778 | 4203 | 1.4% |
| 749697 | 4079 | 1.3% |
| 724477 | 3918 | 1.3% |
| 1073256 | 3886 | 1.3% |
| 1088259 | 3860 | 1.2% |
| Other values (287) | 266819 |
| Value | Count | Frequency (%) |
| 361 | 54 | < 0.1% |
| 1596 | 130 | |
| 1673 | 135 | |
| 1680 | 212 | |
| 1892 | 131 | |
| 1972 | 67 | < 0.1% |
| 2120 | 173 | |
| 2362 | 77 | < 0.1% |
| 2462 | 173 | |
| 2711 | 252 |
| Value | Count | Frequency (%) |
| 1479231 | 2971 | |
| 1441760 | 3044 | |
| 1412476 | 3561 | |
| 1328975 | 3534 | |
| 1319370 | 3545 | |
| 1317998 | 3432 | |
| 1298963 | 3460 | |
| 1271973 | 3572 | |
| 1258693 | 1177 | 0.4% |
| 1255889 | 1201 | 0.4% |
AANTAL_SUBTRAJECT_PER_SPC
Real number (ℝ≥0)
High correlation
| Distinct | 297 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1059258.375 |
| Minimum | 369 |
|---|---|
| Maximum | 2637767 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.4 MiB |
Quantile statistics
| Minimum | 369 |
|---|---|
| 5-th percentile | 46355 |
| Q1 | 360758 |
| median | 1062980 |
| Q3 | 1714002 |
| 95-th percentile | 2470320 |
| Maximum | 2637767 |
| Range | 2637398 |
| Interquartile range (IQR) | 1353244 |
Descriptive statistics
| Standard deviation | 744554.9739 |
|---|---|
| Coefficient of variation (CV) | 0.7029021353 |
| Kurtosis | -0.8817237113 |
| Mean | 1059258.375 |
| Median Absolute Deviation (MAD) | 701200 |
| Skewness | 0.3273051977 |
| Sum | 3.274347712 × 10¹¹ |
| Variance | 5.543621092 × 10¹¹ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1202049 | 5096 | 1.6% |
| 1267242 | 4344 | 1.4% |
| 1204843 | 4341 | 1.4% |
| 1299156 | 4317 | 1.4% |
| 1282060 | 4254 | 1.4% |
| 1318735 | 4203 | 1.4% |
| 1129613 | 4079 | 1.3% |
| 1062980 | 3918 | 1.3% |
| 2534951 | 3886 | 1.3% |
| 2637767 | 3860 | 1.2% |
| Other values (287) | 266819 |
| Value | Count | Frequency (%) |
| 369 | 54 | < 0.1% |
| 1803 | 212 | |
| 1847 | 130 | |
| 1937 | 135 | |
| 2028 | 67 | < 0.1% |
| 2167 | 131 | |
| 2362 | 77 | < 0.1% |
| 2715 | 252 | |
| 2779 | 173 | |
| 2852 | 173 |
| Value | Count | Frequency (%) |
| 2637767 | 3860 | |
| 2575909 | 3842 | |
| 2554272 | 3768 | |
| 2534951 | 3886 | |
| 2470320 | 3850 | |
| 2371642 | 3713 | |
| 2171234 | 3754 | |
| 2056831 | 3809 | |
| 2024740 | 1169 | 0.4% |
| 1970612 | 1165 | 0.4% |
GEMIDDELDE_VERKOOPPRIJS
Real number (ℝ≥0)
| Distinct | 3410 |
|---|---|
| Distinct (%) | 1.3% |
| Missing | 49688 |
| Missing (%) | 16.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3519.708918 |
| Minimum | 70 |
|---|---|
| Maximum | 287010 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.4 MiB |
Quantile statistics
| Minimum | 70 |
|---|---|
| 5-th percentile | 140 |
| Q1 | 465 |
| median | 1215 |
| Q3 | 4065 |
| 95-th percentile | 13410 |
| Maximum | 287010 |
| Range | 286940 |
| Interquartile range (IQR) | 3600 |
Descriptive statistics
| Standard deviation | 6500.98308 |
|---|---|
| Coefficient of variation (CV) | 1.84702293 |
| Kurtosis | 154.0289247 |
| Mean | 3519.708918 |
| Median Absolute Deviation (MAD) | 995 |
| Skewness | 7.399902723 |
| Sum | 913114565 |
| Variance | 42262781.01 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 160 | 1951 | 0.6% |
| 105 | 1875 | 0.6% |
| 110 | 1786 | 0.6% |
| 180 | 1527 | 0.5% |
| 145 | 1362 | 0.4% |
| 185 | 1312 | 0.4% |
| 300 | 1285 | 0.4% |
| 125 | 1271 | 0.4% |
| 165 | 1253 | 0.4% |
| 175 | 1230 | 0.4% |
| Other values (3400) | 244577 | |
| (Missing) | 49688 | 16.1% |
| Value | Count | Frequency (%) |
| 70 | 225 | 0.1% |
| 75 | 75 | < 0.1% |
| 80 | 362 | 0.1% |
| 85 | 918 | |
| 90 | 688 | 0.2% |
| 95 | 691 | 0.2% |
| 100 | 905 | |
| 105 | 1875 | |
| 110 | 1786 | |
| 115 | 1015 |
| Value | Count | Frequency (%) |
| 287010 | 8 | |
| 148910 | 3 | < 0.1% |
| 142860 | 4 | |
| 122150 | 4 | |
| 116765 | 3 | < 0.1% |
| 109725 | 7 | |
| 108570 | 7 | |
| 107655 | 4 | |
| 101270 | 8 | |
| 95460 | 7 |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
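As a rough illustration of that definition, the sketch below ranks both variables (averaging tied ranks) and applies the covariance-over-standard-deviations formula to the ranks. The function names are hypothetical, this is not the code the report itself runs, and the sample values are the first six AANTAL_PAT_PER_ZPD and AANTAL_SUBTRAJECT_PER_ZPD entries from the "First rows" table.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance divided by the product of standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


patients = [60, 3625, 17, 3763, 2, 1250]
subtrajects = [61, 4016, 17, 4024, 3, 1283]
rho = spearman(patients, subtrajects)

# A nonlinear but monotonic relation: rho stays at 1 while r drops below 1.
xs = [1, 2, 3, 4, 5]
cubes = [v ** 3 for v in xs]
rho_cubes = spearman(xs, cubes)
r_cubes = pearson(xs, cubes)
```

In these six rows the two columns are ordered identically, so ρ comes out at 1 up to floating-point error, consistent with the high-correlation alerts for this pair.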
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
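The formula and the invariance property can be checked with a few lines of plain Python. This is a minimal sketch on made-up toy data; `pearson_r` is a hypothetical helper, not a function from the report's tooling.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


# Toy data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]
r = pearson_r(x, y)

# Shift and rescale each variable separately (positive scale factors).
r_shifted = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])

print(round(r, 6), round(r_shifted, 6))
```

Both calls give the same coefficient because the location shifts cancel in the centered sums and the positive scale factors cancel between numerator and denominator.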
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
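The pair-counting description corresponds to the simplest variant, often called τ-a. The sketch below implements exactly that on made-up toy data; `kendall_tau_a` is a hypothetical name, and library implementations commonly use a tie-adjusted variant (τ-b) that can differ when values repeat.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total pairs; ties count as neither."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1      # both variables move the same way
        elif direction < 0:
            discordant += 1      # the variables move in opposite ways
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Toy data, chosen only for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
tau = kendall_tau_a(x, y)
print(tau)
```

Of the 10 possible pairs here, 8 are concordant and 2 are discordant (the swapped neighbours 1/2 and 3/4), so the numerator is 6.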
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available.
First rows
| | VERSIE | DATUM_BESTAND | PEILDATUM | JAAR | BEHANDELEND_SPECIALISME_CD | TYPERENDE_DIAGNOSE_CD | ZORGPRODUCT_CD | AANTAL_PAT_PER_ZPD | AANTAL_SUBTRAJECT_PER_ZPD | AANTAL_PAT_PER_DIAG | AANTAL_SUBTRAJECT_PER_DIAG | AANTAL_PAT_PER_SPC | AANTAL_SUBTRAJECT_PER_SPC | GEMIDDELDE_VERKOOPPRIJS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.0 | 2022-08-26 | 2022-08-01 | 2019-01-01 | 1900 | 11 | 991900023 | 60 | 61 | 61280 | 79505 | 78609 | 103007 | NaN |
| 1 | 1.0 | 2022-08-26 | 2022-08-01 | 2019-01-01 | 1900 | 12 | 991900018 | 3625 | 4016 | 17685 | 22776 | 78609 | 103007 | 440.0 |
| 2 | 1.0 | 2022-08-26 | 2022-08-01 | 2019-01-01 | 1900 | 12 | 991900006 | 17 | 17 | 17685 | 22776 | 78609 | 103007 | NaN |
| 3 | 1.0 | 2022-08-26 | 2022-08-01 | 2019-01-01 | 1900 | 11 | 991900024 | 3763 | 4024 | 61280 | 79505 | 78609 | 103007 | 305.0 |
| 4 | 1.0 | 2022-08-26 | 2022-08-01 | 2019-01-01 | 1900 | 13 | 991900026 | 2 | 3 | 717 | 726 | 78609 | 103007 | 115.0 |
| 5 | 1.0 | 2022-08-26 | 2022-08-01 | 2019-01-01 | 1900 | 12 | 991900011 | 1250 | 1283 | 17685 | 22776 | 78609 | 103007 | 170.0 |
| 6 | 1.0 | 2022-08-26 | 2022-08-01 | 2019-01-01 | 1900 | 13 | 991900025 | 20 | 21 | 717 | 726 | 78609 | 103007 | 290.0 |
| 7 | 1.0 | 2022-08-26 | 2022-08-01 | 2019-01-01 | 1900 | 12 | 991900008 | 2570 | 2644 | 17685 | 22776 | 78609 | 103007 | 460.0 |
| 8 | 1.0 | 2022-08-26 | 2022-08-01 | 2019-01-01 | 1900 | 11 | 991900020 | 1100 | 1162 | 61280 | 79505 | 78609 | 103007 | 2260.0 |
| 9 | 1.0 | 2022-08-26 | 2022-08-01 | 2019-01-01 | 1900 | 11 | 991900021 | 78 | 80 | 61280 | 79505 | 78609 | 103007 | NaN |
Last rows
| | VERSIE | DATUM_BESTAND | PEILDATUM | JAAR | BEHANDELEND_SPECIALISME_CD | TYPERENDE_DIAGNOSE_CD | ZORGPRODUCT_CD | AANTAL_PAT_PER_ZPD | AANTAL_SUBTRAJECT_PER_ZPD | AANTAL_PAT_PER_DIAG | AANTAL_SUBTRAJECT_PER_DIAG | AANTAL_PAT_PER_SPC | AANTAL_SUBTRAJECT_PER_SPC | GEMIDDELDE_VERKOOPPRIJS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 309107 | 1.0 | 2022-08-26 | 2022-08-01 | 2013-01-01 | 313 | 964 | 29199262 | 7 | 7 | 2502 | 11078 | 1022928 | 1970508 | NaN |
| 309108 | 1.0 | 2022-08-26 | 2022-08-01 | 2013-01-01 | 313 | 423 | 19999007 | 9 | 9 | 529 | 622 | 1022928 | 1970508 | 6250.0 |
| 309109 | 1.0 | 2022-08-26 | 2022-08-01 | 2013-01-01 | 313 | 434 | 79799010 | 3 | 3 | 70 | 76 | 1022928 | 1970508 | NaN |
| 309110 | 1.0 | 2022-08-26 | 2022-08-01 | 2013-01-01 | 313 | 291 | 20112038 | 1 | 1 | 565 | 620 | 1022928 | 1970508 | NaN |
| 309111 | 1.0 | 2022-08-26 | 2022-08-01 | 2013-01-01 | 313 | 955 | 29199254 | 1 | 1 | 1065 | 1689 | 1022928 | 1970508 | NaN |
| 309112 | 1.0 | 2022-08-26 | 2022-08-01 | 2013-01-01 | 313 | 434 | 79799019 | 2 | 2 | 70 | 76 | 1022928 | 1970508 | 500.0 |
| 309113 | 1.0 | 2022-08-26 | 2022-08-01 | 2013-01-01 | 313 | 822 | 20108070 | 1 | 1 | 524 | 2026 | 1022928 | 1970508 | NaN |
| 309114 | 1.0 | 2022-08-26 | 2022-08-01 | 2013-01-01 | 313 | 451 | 131999145 | 1 | 1 | 627 | 747 | 1022928 | 1970508 | 4200.0 |
| 309115 | 1.0 | 2022-08-26 | 2022-08-01 | 2017-01-01 | 324 | 104 | 131999063 | 1 | 1 | 1860 | 2828 | 283483 | 491585 | NaN |
| 309116 | 1.0 | 2022-08-26 | 2022-08-01 | 2013-01-01 | 313 | 313 | 39999019 | 1 | 1 | 63 | 83 | 1022928 | 1970508 | NaN |